Intelligent Agents with Emotional Intelligence: Current Trends, Challenges, and Future Prospects

Zall, Raziyeh, Kheyrkhah, Alireza, Cambria, Erik, Naseri, Zahra, Kangavari, M. Reza

arXiv.org Artificial Intelligence

Developing intelligent agents that possess human-level intelligence is a key goal in the field of human-computer interaction (HCI) and general artificial intelligence [2]. A crucial aspect of achieving this goal is the incorporation of emotional intelligence, which is essential for human cognition and social interaction, into these intelligent agents. Emotional intelligence encompasses three interrelated capabilities: 1) emotion understanding, which involves accurately detecting and understanding affective signals, such as recognizing individuals' emotional states during interactions; 2) emotion elicitation and experience, which refers to interpreting the causes, context, and implications of emotions for both the individual and the interaction; and 3) emotion expression, which encompasses the capacity to generate, modulate, and convey appropriate emotional responses in a socially meaningful manner. Affective Computing, a term coined by Rosalind Picard [1], emerged as a discipline dedicated to equipping machines with emotional intelligence, enabling them to recognize, interpret, and respond to human emotions. By embedding emotional intelligence into intelligent agents, affective computing facilitates more naturalistic, adaptive, and socially competent interactions, which in turn enhances user trust, engagement, and satisfaction [209]. Such emotionally intelligent systems not only improve usability but also enable advanced functionalities, including personalized assistance, empathetic dialogue, and context-aware decision-making. In Figure 1, an overview of the emotional intelligence capabilities in intelligent agents is presented. The process of emotional intelligence begins with analyzing the emotional aspects of the user input, enabling the agent to identify the user's affective state during interactions [259][306]. The next step is affective cognition, where the agent evaluates the observed emotional events using cognitive mental states to ensure accurate interpretation.


VLURes: Benchmarking VLM Visual and Linguistic Understanding in Low-Resource Languages

Atuhurra, Jesse, Ali, Iqra, Iwakura, Tomoya, Kamigaito, Hidetaka, Hiraoka, Tatsuya

arXiv.org Artificial Intelligence

Vision Language Models (VLMs) are pivotal for advancing perception in intelligent agents. Yet evaluation of VLMs remains limited to predominantly English-centric benchmarks in which the image-text pairs comprise short texts. To evaluate fine-grained VLM abilities in four languages under long-text settings, we introduce VLURes, a novel multilingual benchmark featuring eight vision-and-language tasks and a pioneering unrelatedness task, to probe the fine-grained visual and linguistic understanding capabilities of VLMs across English, Japanese, and the low-resource languages Swahili and Urdu. Our datasets, curated from web resources in the target languages, encompass ten diverse image categories and rich textual context, introducing valuable vision-language resources for Swahili and Urdu. By prompting VLMs to generate responses and rationales, evaluated automatically and by native speakers, we uncover performance disparities across languages and tasks critical to intelligent agents, such as object recognition, scene understanding, and relationship understanding. We conducted evaluations of ten VLMs with VLURes. The best-performing model, GPT-4o, achieves an overall accuracy of 90.8% and lags human performance by 6.7%, though the gap is larger for open-source models. The gap highlights VLURes' critical role in developing intelligent agents to tackle multi-modal visual reasoning.


Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics

Jin, Sheng, Wang, Haoming, Gao, Zhiqi, Yang, Yongbo, Chunjia, Bao, Wang, Chengliang

arXiv.org Artificial Intelligence

Large language model (LLM)-based agents are increasingly pivotal in simulating and understanding complex human systems and interactions. We propose the AI-Agent School (AAS) system, built around a self-evolving mechanism that leverages agents for simulating complex educational dynamics. Addressing the fragmented modeling of the teaching process and the limited performance of agents in simulating diverse educational participants, AAS constructs the Zero-Exp strategy: a continuous "experience-reflection-optimization" cycle grounded in a dual memory base that comprises experience and knowledge bases and incorporates short-term and long-term memory components. Through this mechanism, agents autonomously evolve via situated interactions within diverse simulated school scenarios. This evolution enables agents to more accurately model the nuanced, multi-faceted teacher-student engagements and underlying learning processes found in physical schools. Experiments confirm that AAS can effectively simulate intricate educational dynamics and is effective in fostering advanced agent cognitive abilities, providing a foundational stepping stone from the "Era of Experience" to the "Era of Simulation" by generating high-fidelity behavioral and interaction data.



Agentic-AI Healthcare: Multilingual, Privacy-First Framework with MCP Agents

Shehab, Mohammed A.

arXiv.org Artificial Intelligence

This paper introduces Agentic-AI Healthcare, a privacy-aware, multilingual, and explainable research prototype developed as a single-investigator project. The platform integrates a dedicated Privacy & Compliance Layer that applies role-based access control (RBAC), AES-GCM field-level encryption, and tamper-evident audit logging, aligning with major healthcare data protection standards such as HIPAA (US), PIPEDA (Canada), and PHIPA (Ontario). Example use cases demonstrate multilingual patient-doctor interaction (English, French, Arabic) and transparent diagnostic reasoning powered by large language models. As an applied AI contribution, this work highlights the feasibility of combining agentic orchestration via the Model Context Protocol (MCP), multilingual accessibility, and compliance-aware architecture in a single working stack for healthcare applications. This platform is presented as a research prototype and is not a certified medical device.
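Two of the Privacy & Compliance Layer mechanisms named in the abstract, RBAC and tamper-evident audit logging, can be illustrated with a minimal sketch. This is not the paper's actual implementation: the role names, permission sets, and hash-chain log schema below are illustrative assumptions, and real AES-GCM field encryption (the third mechanism) would additionally require an AEAD library.

```python
import hashlib
import json

# Illustrative role/permission table (hypothetical schema, not the paper's).
ROLE_PERMISSIONS = {
    "doctor": {"read_record", "write_record"},
    "nurse": {"read_record"},
    "patient": {"read_own_record"},
}

def check_access(role: str, action: str) -> bool:
    # RBAC: an action is allowed only if the role's permission set contains it.
    return action in ROLE_PERMISSIONS.get(role, set())

class AuditLog:
    """Append-only log in which each entry's hash covers the previous entry's
    hash, so modifying any earlier event breaks the chain (tamper-evident)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []
        self._prev = self.GENESIS

    def append(self, event: dict) -> None:
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((self._prev + payload).encode()).hexdigest()
        self.entries.append({"event": event, "hash": digest})
        self._prev = digest

    def verify(self) -> bool:
        # Recompute the chain from the genesis value; any mismatch means tampering.
        prev = self.GENESIS
        for entry in self.entries:
            payload = json.dumps(entry["event"], sort_keys=True)
            if hashlib.sha256((prev + payload).encode()).hexdigest() != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

A hash chain of this kind makes retroactive edits detectable, but an attacker who controls the whole log can still rewrite it end to end; production systems therefore anchor the chain head in external storage or sign entries.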


The Agent Behavior: Model, Governance and Challenges in the AI Digital Age

Zhang, Qiang, Yan, Pei, Xu, Yijia, Fu, Chuanpo, Fang, Yong, Liu, Yang

arXiv.org Artificial Intelligence

Advancements in AI have led to agents in networked environments increasingly mirroring human behavior, thereby blurring the boundary between artificial and human actors in specific contexts. This shift brings about significant challenges in trust, responsibility, ethics, and security. The difficulty of supervising agent behaviors may lead to issues such as data contamination and unclear accountability. To address these challenges, this paper proposes the "Network Behavior Lifecycle" model, which divides network behavior into six stages and systematically analyzes the behavioral differences between humans and agents at each stage. Based on these insights, the paper further introduces the "Agent for Agent (A4A)" paradigm and the "Human-Agent Behavioral Disparity (HABD)" model, which examine the fundamental distinctions between human and agent behaviors across five dimensions: decision mechanism, execution efficiency, intention-behavior consistency, behavioral inertia, and irrational patterns. The effectiveness of the model is verified through real-world cases such as red team penetration and blue team defense. Finally, the paper discusses future research directions in dynamic cognitive governance architecture, behavioral disparity quantification, and meta-governance protocol stacks, aiming to provide a theoretical foundation and technical roadmap for secure and trustworthy human-agent collaboration.


Agent Guide: A Simple Agent Behavioral Watermarking Framework

Huang, Kaibo, Zhang, Zipei, Yang, Zhongliang, Zhou, Linna

arXiv.org Artificial Intelligence

The increasing deployment of intelligent agents in digital ecosystems, such as social media platforms, has raised significant concerns about traceability and accountability, particularly in cybersecurity and digital content protection. Traditional large language model (LLM) watermarking techniques, which rely on token-level manipulations, are ill-suited for agents due to the challenges of behavior tokenization and information loss during behavior-to-action translation. To address these issues, we propose Agent Guide, a novel behavioral watermarking framework that embeds watermarks by guiding the agent's high-level decisions (behavior) through probability biases, while preserving the naturalness of specific executions (action). Our approach decouples agent behavior into two levels, behavior (e.g., choosing to bookmark) and action (e.g., bookmarking with specific tags), and applies watermark-guided biases to the behavior probability distribution. We employ a z-statistic-based statistical analysis to detect the watermark, ensuring reliable extraction over multiple rounds. Experiments in a social media scenario with diverse agent profiles demonstrate that Agent Guide achieves effective watermark detection with a low false positive rate. Our framework provides a practical and robust solution for agent watermarking, with applications in identifying malicious agents and protecting proprietary agent systems.
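The general pattern the abstract describes, biasing a behavior-level probability distribution with a keyed "green" subset and detecting the watermark with a z-test on how often chosen behaviors land in that subset, can be sketched as follows. This is a generic illustration of that class of scheme, not Agent Guide's actual algorithm: the behavior list, the keying via SHA-256, and the gamma/delta parameters are all assumptions.

```python
import hashlib
import math
import random

# Illustrative behavior vocabulary for a social-media agent (hypothetical).
BEHAVIORS = ["bookmark", "like", "repost", "comment", "follow"]

def green_set(secret: str, round_idx: int, gamma: float = 0.4) -> set:
    # Pseudo-randomly mark a fraction gamma of behaviors as "green" for this
    # round, keyed by the watermark secret (assumed keying scheme).
    rng = random.Random(hashlib.sha256(f"{secret}:{round_idx}".encode()).hexdigest())
    k = max(1, int(gamma * len(BEHAVIORS)))
    return set(rng.sample(BEHAVIORS, k))

def biased_choice(probs: dict, green: set, delta: float = 2.0) -> str:
    # Boost the probability mass of green behaviors by factor delta, then
    # sample from the renormalized distribution.
    weights = {b: p * (delta if b in green else 1.0) for b, p in probs.items()}
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for behavior, w in weights.items():
        acc += w
        if r <= acc:
            return behavior
    return behavior  # fallback for floating-point rounding

def z_statistic(choices: list, secret: str, gamma: float = 0.4) -> float:
    # Under the null (no watermark), each choice is green with probability
    # gamma; the z-score measures the excess of observed green hits.
    n = len(choices)
    hits = sum(1 for i, b in enumerate(choices) if b in green_set(secret, i, gamma))
    return (hits - n * gamma) / math.sqrt(n * gamma * (1 - gamma))
```

Detecting with the correct secret over a few hundred rounds yields a large positive z-score, while an unrelated secret yields a score near zero, which is what keeps the false positive rate low.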


Agentic AI and Multiagentic: Are We Reinventing the Wheel?

Botti, V.

arXiv.org Artificial Intelligence

The terms Agentic AI and Multiagentic AI have recently gained popularity in discussions on generative artificial intelligence, often used to describe autonomous software agents and systems composed of such agents. However, the use of these terms confuses these buzzwords with well-established concepts in AI literature: intelligent agents and multi-agent systems. This article offers a critical analysis of this conceptual misuse. We review the theoretical origins of "agentic" in the social sciences (Bandura, 1986) and philosophical notions of intentionality (Dennett, 1971), and then summarise foundational works on intelligent agents and multi-agent systems by Wooldridge, Jennings and others. We examine classic agent architectures, from simple reactive agents to Belief-Desire-Intention (BDI) models, and highlight key properties (autonomy, reactivity, proactivity, social capability) that define agency in AI. We then discuss recent developments in large language models (LLMs) and agent platforms based on LLMs, including the emergence of LLM-powered AI agents and open-source multi-agent orchestration frameworks. We argue that the term AI Agentic is often used as a buzzword for what are essentially AI agents, and AI Multiagentic for what are multi-agent systems. This confusion overlooks decades of research in the field of autonomous agents and multi-agent systems. The article advocates for scientific and technological rigour and the use of established terminology from the state of the art in AI, incorporating the wealth of existing knowledge, including standards for multi-agent system platforms, communication languages and coordination and cooperation algorithms, agreement technologies (automated negotiation, argumentation, virtual organisations, trust, reputation, etc.), into the new and promising wave of LLM-based AI agents, so as not to end up reinventing the wheel.


Position Paper: Bounded Alignment: What (Not) To Expect From AGI Agents

Minai, Ali A.

arXiv.org Artificial Intelligence

The issues of AI risk and AI safety are becoming critical as the prospect of artificial general intelligence (AGI) looms larger. The emergence of extremely large and capable generative models has led to alarming predictions and created a stir from boardrooms to legislatures. As a result, AI alignment has emerged as one of the most important areas in AI research. The goal of this position paper is to argue that the currently dominant vision of AGI in the AI and machine learning (AI/ML) community needs to evolve, and that expectations and metrics for its safety must be informed much more by our understanding of the only existing instance of general intelligence, i.e., the intelligence found in animals, and especially in humans. This change in perspective will lead to a more realistic view of the technology, and allow for better policy decisions. The most successful AI systems today, such as large language models (LLMs) [1]-[5], are based on a computationalist, statistical, and decision-theoretic paradigm rather than a biological one. As these systems scale up in size, they are improving their performance in areas such as reasoning [6]-[9], and becoming more multimodal [10]-[14]. AI agents [15]-[17], including physical ones [18]-[20], are also becoming increasingly capable. With these rapid advances, there is an expectation that powerful systems with artificial general intelligence (AGI) may soon be at hand. Through all this, there is a general desire that AGI must remain subject to human control and intervention, and must exist only to serve human needs (see, for example, the discussion in [21]). There is also great concern that increasingly powerful AGI systems with autonomous agency might pose serious risks, including existential ones [22]-[27], which has led to a focus on AI alignment, i.e., making AI systems consistent with human norms and preferences [28], [29].
The main position argued in this paper is that: 1) general intelligence should be seen in terms of its archetype, the intelligence of living agents; and 2) the goal of building powerful AGI agents is fundamentally inconsistent with the expectation of complete alignment or near-total control of AGI agents by humans, even in principle.


Metacognition in Content-Centric Computational Cognitive C4 Modeling

Nirenburg, Sergei, McShane, Marjorie, Oruganti, Sanjay

arXiv.org Artificial Intelligence

For AI agents to emulate human behavior, they must be able to perceive, meaningfully interpret, store, and use large amounts of information about the world, themselves, and other agents. Metacognition is a necessary component of all of these processes. In this paper, we briefly a) introduce content-centric computational cognitive (C4) modeling for next-generation AI agents; b) review the long history of developing C4 agents at RPI's LEIA (Language-Endowed Intelligent Agents) Lab; c) discuss our current work on extending LEIAs' cognitive capabilities to cognitive robotic applications developed using a neuro-symbolic processing model; and d) sketch plans for future developments in this paradigm that aim to overcome underappreciated limitations of currently popular, LLM-driven methods in AI.